95 research outputs found

    Sentiment Analysis: An Overview from Linguistics

    Sentiment analysis is a growing field at the intersection of linguistics and computer science, which attempts to automatically determine the sentiment, or positive/negative opinion, contained in text. Sentiment can be characterized as positive or negative evaluation expressed through language. Common applications of sentiment analysis include the automatic determination of whether a review posted online (of a movie, a book, or a consumer product) is positive or negative towards the item being reviewed. Sentiment analysis is now a common tool in the repertoire of social media analysis carried out by companies, marketers and political analysts. Research on sentiment analysis extracts information from positive and negative words in text, from the context of those words, and the linguistic structure of the text. This brief survey examines in particular the contributions that linguistic knowledge can make to the problem of automatically determining sentiment.

    Discourse Relations and Evaluation

    We examine the role of discourse relations (relations between propositions) in the interpretation of evaluative or opinion words. Through a combination of Rhetorical Structure Theory or RST (Mann & Thompson, 1988) and Appraisal Theory (Martin & White, 2005), we analyze how different discourse relations modify the evaluative content of opinion words, and what impact the nucleus-satellite structure in RST has on the evaluation. We conduct a corpus study, examining and annotating over 3,000 evaluative words in 50 movie reviews in the SFU Review Corpus (Taboada, 2008) with respect to five parameters: word category (nouns, verbs, adjectives or adverbs), prior polarity (positive, negative or neutral), RST structure (both nucleus-satellite status and relation type) and change of polarity as a result of being part of a discourse relation (Intensify, Downtone, Reversal or No Change). Results show that relations such as Concession, Elaboration, Evaluation, Evidence and Restatement most frequently intensify the polarity of the opinion words, although the majority of evaluative words (about 70%) do not undergo changes in their polarity because of the relations they are a part of. We also find that most opinion words (about 70%) are positioned in the nucleus, confirming a hypothesis in the literature that nuclei are the most important units when extracting evaluation automatically.

    Cataphoric it and backgrounding from the point of view of coherence relations

    Following Harris & Bates's (2002) observation that cataphora is allowed in subordinate backgrounded clauses, we examine backgrounding at the discourse level, making use of the nucleus-satellite distinction in Rhetorical Structure Theory (Mann & Thompson 1988). We extract examples of cataphoric it in an RST-annotated corpus and conclude that there is no strict correlation between cataphora and backgrounding, as cataphoric it appears in both nuclei and satellites. Using diagnostics in Ariel (1990), we propose that the occurrence of cataphora in nuclei is explained by: (i) cohesion, and (ii) first mention versus continuation of a discourse referent.

    The Semantics of Evaluational Adjectives: Perspectives from Natural Semantic Metalanguage and Appraisal

    We apply the Natural Semantic Metalanguage (NSM) approach (Goddard & Wierzbicka 2014) to the lexical-semantic analysis of English evaluational adjectives and compare the results with the picture developed in the Appraisal Framework (Martin & White 2005). The analysis is corpus-assisted, with examples mainly drawn from film and book reviews, and supported by collocational and statistical information from WordBanks Online. We propose NSM explications for 15 evaluational adjectives, arguing that they fall into five groups, each of which corresponds to a distinct semantic template. The groups can be sketched as follows: “First-person thought-plus-affect”, e.g. wonderful; “Experiential”, e.g. entertaining; “Experiential with bodily reaction”, e.g. gripping; “Lasting impact”, e.g. memorable; “Cognitive evaluation”, e.g. complex, excellent. These groupings and semantic templates are compared with the classifications in the Appraisal Framework’s system of Appreciation. In addition, we are particularly interested in sentiment analysis, the automatic identification of evaluation and subjectivity in text. We discuss the relevance of the two frameworks for sentiment analysis and other language technology applications.

    RST Signalling Corpus: A Corpus of Signals of Coherence Relations

    We present the RST Signalling Corpus (Das et al. in RST signalling corpus, LDC2015T10. https://catalog.ldc.upenn.edu/LDC2015T10, 2015), a corpus annotated for signals of coherence relations. The corpus is developed over the RST Discourse Treebank (Carlson et al. in RST Discourse Treebank, LDC2002T07. https://catalog.ldc.upenn.edu/LDC2002T07, 2002), which is annotated for coherence relations. In the RST Signalling Corpus, these relations are further annotated with signalling information. The corpus includes annotation not only for discourse markers, which are considered to be the most typical (or sometimes the only type of) signals in discourse, but also for a wide array of other signals such as reference, lexical, semantic, syntactic, graphical and genre features as potential indicators of coherence relations. We describe the research underlying the development of the corpus and the annotation process, and provide details of the corpus. We also present the results of an inter-annotator agreement study, illustrating the validity and reproducibility of the annotation. The corpus is available through the Linguistic Data Consortium, and can be used to investigate the psycholinguistic mechanisms behind the interpretation of relations through signalling, and also to develop discourse-specific computational systems such as discourse parsing applications.

    A corpus analysis of online news comments using the Appraisal framework

    We present detailed analyses of the distribution of Appraisal categories (Martin and White, 2005) in a corpus of online news comments. The corpus consists of just over one thousand comments posted in response to a variety of opinion pieces on the website of the Canadian newspaper The Globe and Mail. We annotated all the comments with labels corresponding to different categories of the Appraisal framework. Analyses of the annotations show that comments are overwhelmingly negative, and that they favour two of the subtypes of Attitude (Judgement and Appreciation) over the third, Affect. The paper contributes a methodology for annotating Appraisal, and results that show the interaction of Appraisal with negation, the constructive (or not) nature of comments, and the level of toxicity found in them. The results show that highly opinionated language is expressed as an objective opinion (Judgement and Appreciation) rather than an emotional reaction (Affect). This finding, together with the interplay of evaluative language with constructiveness and toxicity in the comments, can be applied to the automatic moderation of comments.

    Evaluation in Political Discourse Addressed to Women: Appraisal Analysis of Cosmopolitan's Coverage of the 2014 US Midterm Elections

    Before the US midterm elections of November 2014, the well-known women’s magazine Cosmopolitan decided to include politics in its contents. The editorial board stated that their aim was to encourage readers to vote and to be engaged with women’s rights advocacy in the election process. To that end, Cosmopolitan created a new website, CosmoVotes, with content ranging from discussion of political issues to endorsement of specific candidates who were believed to advance women’s issues. Topics include labour rights, abortion, contraception, health, minimum wage and social equity. This paper evaluates the discourse of this new section of the Cosmopolitan website, together with readers’ responses, concentrating on evaluative language. In particular, we are concerned with differences between the editorial position and readers’ responses as viewed through the Appraisal framework (Martin & White, 2005), and the role that verbal processes play in the expression of evaluative meanings. The corpus used for the analysis consists of a selection of articles and readers’ opinions from CosmoVotes. The methodology is based on annotation of Appraisal features and processes related to the interpersonal dimension of meaning. Those features reveal how attitudes are evaluated and capture ideological positionings in this discourse. Our results show that CosmoVotes has special characteristics, such as a predominance of high intensification in the readers’ opinions, and strong negative judgements and expressions, while the magazine’s pieces on political issues are more nuanced and eschew intensification.

    Big Data and Quality Data for Fake News and Misinformation Detection

    Fake news has become an important topic of research in a variety of disciplines including linguistics and computer science. In this paper, we explain how the problem is approached from the perspective of natural language processing, with the goal of building a system to automatically detect misinformation in news. The main challenge in this line of research is collecting quality data, i.e., instances of fake and real news articles on a balanced distribution of topics. We review available datasets and introduce the MisInfoText repository as a contribution of our lab to the community. We make available the full text of the news articles, together with veracity labels previously assigned based on manual assessment of the articles’ truth content. We also perform a topic modelling experiment to elaborate on the gaps and sources of imbalance in currently available datasets to guide future efforts. We appeal to the community to collect more data and to make it available for research purposes.